378 research outputs found

    The development of computational methods for large-scale comparisons and analyses of genome evolution

    The last four decades have seen the development of a number of experimental methods for the deduction of the whole genome sequences of an ever-increasing number of organisms. These sequences have, in the first instance, allowed their investigators the opportunity to examine the molecular primary structure of areas of scientific interest, but with the increased sampling of organisms across the phylogenetic tree and the improved quality and coverage of genome sequences and their associated annotations, the opportunity to undertake detailed comparisons both within and between taxonomic groups has presented itself. The work described in this thesis details the application of comparative bioinformatics analyses on inter- and intra-genomic datasets, to elucidate those genomic changes that may underlie organismal adaptations and contribute to changes in the complexity of genome content and structure over time. The results contained herein demonstrate the power and flexibility of the comparative approach, utilising whole genome data, to elucidate the answers to some of the most pressing questions in the biological sciences today. As the volume of genomic data increases, both as a result of increased sampling of the tree of life and due to an increase in the quality and throughput of the sequencing methods, it has become clear that there is a necessity for computational analyses of these data. Manual analysis of this volume of data, which can extend beyond petabytes of storage space, is now impossible. Automated computational pipelines are therefore required to retrieve, categorise and analyse these data. Chapter two discusses the development of a computational pipeline named the Genome Comparison and Analysis Toolkit (GCAT). The pipeline was developed using the Perl programming language and is tightly integrated with the Ensembl Perl API, allowing for the retrieval and analysis of their rich genomic resources.
In the first instance, the pipeline was tested for its robustness by retrieving and describing various components of genomic architecture across a number of taxonomic groups. Additionally, the need for programmatically independent means of accessing data and in particular the need for Semantic Web based protocols and tools for the sharing of genomics resources is highlighted. This is not just for the requirements of researchers, but for improved communication and sharing between computational infrastructure. A prototype Ensembl REST web service was developed in collaboration with the European Bioinformatics Institute (EBI) to provide a means of accessing Ensembl’s genomic data without having to rely on their Perl API. A comparison of the runtime and memory usage of the Ensembl Perl API and prototype REST API was made relative to baseline raw SQL queries, which highlights the overheads inherent in building wrappers around the SQL queries. Differences in the efficiency of the approaches were highlighted, and the importance of investing in the development of Semantic Web technologies as a tool to improve access to data for the wider scientific community is discussed. Data highlighted in chapter two led to the identification of relative differences in the intron structure of a number of organisms including teleost fish. Chapter three encompasses a published, peer-reviewed study. Inter-genomic comparisons were undertaken utilising the 5 available teleost genome sequences in order to examine and describe their intron content. The number and sizes of introns were compared across these fish and a frequency distribution of intron size was produced that identified a novel expansion in the Zebrafish lineage of introns in the size range of approximately 500-2,000 bp. Further hypothesis-driven analyses of the introns across the whole distribution of intron sizes identified that the majority, but not all, of the introns were largely comprised of repetitive elements.
It was concluded that the introns in the Zebrafish peak were likely the result of an ancient expansion of repetitive elements that had since degraded beyond the ability of computational algorithms to identify them. Additional sampling throughout the teleost fish lineage will allow for more focused, phylogenetically driven analyses to be undertaken in the future. In chapter four, phylogenetic comparative analyses of gene duplications were undertaken across primate and rodent taxonomic groups with the intention of identifying significantly expanded or contracted gene families. Changes in the size of gene families may indicate adaptive evolution. A larger number of expansions, relative to time since common ancestor, were identified in the branch leading to modern humans than in any other primate species. Due to the unique nature of the human data in terms of quantity and quality of annotation, additional analyses were undertaken to determine whether the expansions were methodological artefacts or real biological changes. Novel approaches were developed to test the validity of the data including comparisons to other highly annotated genomes. No similar expansion was seen in mouse when comparing with rodent data, though, as assemblies and annotations were updated, there were differences in the number of significant changes, which brings into question the reliability of the underlying assembly and annotation data. This emphasises the importance of an understanding that computational predictions, in the absence of supporting evidence, may be unlikely to represent the actual genomic structure, and instead be more an artefact of the software parameter space. In particular, significant shortcomings are highlighted due to the assumptions and parameters of the models used by the CAFE gene family analysis software.
We must bear in mind that genome assemblies and annotations are hypotheses that themselves need to be questioned and subjected to robust controls to increase the confidence in any conclusions that can be drawn from them. In addition, functional genomics analyses were undertaken to identify the role of significantly changed genes and gene families in primates, testing against a hypothesis that would see the majority of changes involving immune, sensory or reproductive genes. Gene Ontology (GO) annotations were retrieved for these data, which enabled highlighting the broad GO groupings and more specific functional classifications of these data. The results showed that the majority of gene expansions were in families that may have arisen due to adaptation, or were maintained due to their necessary involvement in developmental and metabolic processes. Comparisons were made to previously published studies to determine whether the Ensembl functional annotations were supported by the de novo analyses undertaken in those studies. The majority were not, with only a small number of previously identified functional annotations being present in the most recent Ensembl releases. The impact of gene family evolution on intron evolution was explored in chapter five, by analysing gene family data and intron characteristics across the genomes of 61 vertebrate species. General descriptive statistics and visualisations were produced, along with tests for correlation between change in gene family size and the number, size and density of their associated introns. There was shown to be very little impact of change in gene family size on the underlying intron evolution. Other, non-family effects were therefore considered. These analyses showed that introns were restricted to euchromatic regions, with heterochromatic regions such as the centromeres and telomeres being largely devoid of any such features.
Spatial mechanisms such as recombination, GC-bias across GC-rich isochores and biased gene conversion were thus proposed to play a greater role, though this depends largely on the population genetic and life history traits of the organisms involved. Additional population-level sequencing and comparative analyses across a divergent group of species with available recombination maps and life history data would be a useful future direction in understanding the processes involved.

    Cold Collision Frequency Shift of the 1S-2S Transition in Hydrogen

    We have observed the cold collision frequency shift of the 1S-2S transition in trapped spin-polarized atomic hydrogen. We find $\Delta \nu_{1S-2S} = -3.8(8)\times 10^{-10}\, n$ Hz cm$^3$, where $n$ is the sample density. From this we derive the 1S-2S s-wave triplet scattering length, $a_{1S-2S} = -1.4(3)$ nm, which is in fair agreement with a recent calculation. The shift provides a valuable probe of the distribution of densities in a trapped sample. Comment: Accepted for publication in PRL, 9 pages, 4 PostScript figures, ReVTeX. Updated connection of our measurement to theoretical wor
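
As a rough illustration of how the reported coefficient scales with density, the sketch below multiplies the quoted value of -3.8(8) × 10^-10 Hz cm^3 by a sample density. The function name and the density used in the example are hypothetical and are not taken from the paper; only the coefficient and its uncertainty come from the abstract.

```python
# Shift coefficient quoted in the abstract: -3.8(8) x 10^-10 Hz cm^3.
SHIFT_COEFF_HZ_CM3 = -3.8e-10
SHIFT_COEFF_ERR_HZ_CM3 = 0.8e-10


def cold_collision_shift(density_cm3):
    """Return (shift, uncertainty) in Hz for a density in atoms per cm^3.

    Assumes the shift is strictly linear in density, as the quoted
    coefficient implies, and propagates only the coefficient uncertainty.
    """
    shift_hz = SHIFT_COEFF_HZ_CM3 * density_cm3
    err_hz = SHIFT_COEFF_ERR_HZ_CM3 * density_cm3
    return shift_hz, err_hz


if __name__ == "__main__":
    example_density = 1.0e13  # hypothetical density, chosen for illustration only
    shift, err = cold_collision_shift(example_density)
    print(f"shift = {shift:.3g} Hz, uncertainty = {err:.3g} Hz")
```

Because the relation is linear, a measured shift could in principle be inverted to estimate a local density, which is the sense in which the abstract describes the shift as a probe of the density distribution.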

    Lessons from the evaluation of the UK's NHS R&D Implementation Methods Programme

    Background: Concern about the effective use of research was a major factor behind the creation of the NHS R&D Programme in 1991. In 1994, an advisory group was established to identify research priorities in research implementation. The Implementation Methods Programme (IMP) flowed from this, and its commissioning group funded 36 projects. In 2000 responsibility for the programme passed to the National Co-ordinating Centre for NHS Service Delivery and Organisation R&D, which asked the Health Economics Research Group (HERG), Brunel University, to conduct an evaluation in 2002. By then most projects had been completed. This evaluation was intended to cover: the quality of outputs, lessons to be learnt about the communication strategy and the commissioning process, and the benefits from the projects. Methods: We adopted a wide range of quantitative and qualitative methods. They included: documentary analysis, interviews with key actors, questionnaires to the funded lead researchers, questionnaires to potential users, and desk analysis. Results: Quantitative assessment of outputs and dissemination revealed that the IMP funded useful research projects, some of which had considerable impact against the various categories in the HERG payback model, such as publications, further research, research training, impact on health policy, and clinical practice. Qualitative findings from interviews with advisory and commissioning group members indicated that when the IMP was established, implementation research was a relatively unexplored field. This was reflected in the understanding brought to their roles by members of the advisory and commissioning groups, in the way priorities for research were chosen and developed, and in how the research projects were commissioned. The ideological and methodological debates associated with these decisions have continued among those working in this field. 
The need for an effective communication strategy for the programme as a whole was particularly important. However, such a strategy was never developed, making it difficult to establish the general influence of the IMP as a programme. Conclusion: Our findings about the impact of the work funded, and the difficulties faced by those developing the IMP, have implications for the development of strategic programmes of research in general, as well as for the development of more effective research in this field

    A blind accuracy assessment of computer-modeled forensic facial reconstruction using computed tomography data from live subjects.

    A computer modeling system for facial reconstruction has been developed that employs a touch-based application to create anatomically accurate facial models focusing on skeletal detail. This article discusses the advantages and disadvantages of the system and illustrates its accuracy and reliability with a blind study using computed tomography (CT) data of living individuals. Three-dimensional models of the skulls of two white North American adults (one male, one female) were imported into the computer system. Facial reconstructions were produced by two practitioners following the Manchester method. Two posters were produced, each including a face pool of five surface model images and the facial reconstruction. The face pool related to the sex, age, and ethnic group of the target individual and included the surface model image of the target individual. Fifty-two volunteers were asked to choose the face from the face pool that most resembled each reconstruction. Both reconstructions received majority percentage hit rates that were at least 50% greater than any other face in the pool. The combined percentage hit rate was 50% above chance (70%). A quantitative comparison of the facial morphology between the facial reconstructions and the CT scan models of the subjects was carried out using Rapidform™ 2004 PP2-RF4. The majority of the surfaces of the facial reconstructions showed less than 2.5 mm error and 90% of the male face and 75% of the female face showed less than 5 mm error. Many of the differences between the facial reconstructions and the facial scans were probably the result of positional effects caused during the CT scanning procedure, especially on the female subject who had a fatter face than the male subject. The areas of most facial reconstruction error were at the ears and nasal tip.
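
The "50% above chance (70%)" figure can be checked with a short calculation. The sketch below assumes that chance is one in five, because each face pool contained five images, and that "50% above" means percentage points; both readings are inferred from the abstract rather than stated in it, and the variable names are illustrative.

```python
# Each poster showed a face pool of five images, so a random guess
# picks the target with probability 1/5 (an inferred assumption).
pool_size = 5
chance_pct = 100 / pool_size           # chance level, in percent

combined_hit_rate_pct = 70.0           # combined hit rate quoted in the abstract

# Difference between observed hit rate and chance, in percentage points.
excess_points = combined_hit_rate_pct - chance_pct

print(f"chance = {chance_pct:.0f}%, excess over chance = {excess_points:.0f} points")
```

Under these assumptions the quoted numbers are mutually consistent: a 70% hit rate sits 50 percentage points above a 20% chance level.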

    Genome Sequence of Erythromelalgia-Related Poxvirus Identifies it as an Ectromelia Virus Strain

    Erythromelalgia is a condition characterized by attacks of burning pain and inflammation in the extremities. An epidemic form of this syndrome occurs in secondary students in rural China and a virus referred to as erythromelalgia-associated poxvirus (ERPV) was reported to have been recovered from throat swabs in 1987. Studies performed at the time suggested that ERPV belongs to the orthopoxvirus genus and has similarities with ectromelia virus, the causative agent of mousepox. We have determined the complete genome sequence of ERPV and demonstrated that it has 99.8% identity to the Naval strain of ectromelia virus and a slightly lower identity to the Moscow strain. Small DNA deletions in the Naval genome that are absent from ERPV may suggest that the sequenced strain of Naval was not the immediate progenitor of ERPV.

    Factors associated with completion of bowel cancer screening and the potential effects of simplifying the screening test algorithm

    BACKGROUND: The primary colorectal cancer screening test in England is a guaiac faecal occult blood test (gFOBt). The NHS Bowel Cancer Screening Programme (BCSP) interprets tests on six samples on up to three test kits to determine a definitive positive or negative result. However, the test algorithm fails to achieve a definitive result for a significant number of participants because they do not comply with the programme requirements. This study identifies factors associated with failed compliance and modifications to the screening algorithm that will improve the clinical effectiveness of the screening programme. METHODS: The BCSP Southern Hub data for screening episodes started in 2006–2012 were analysed for participants aged 60–69 years. The variables included age, sex, level of deprivation, gFOBt results and clinical outcome. RESULTS: The data set included 1 409 335 screening episodes; 95.08% of participants had a definitively normal result on kit 1 (no positive spots). Among participants asked to complete a second or third gFOBt, 5.10% and 4.65%, respectively, failed to return a valid kit. Among participants referred for follow up, 13.80% did not comply. Older age was associated with compliance at repeat testing, but non-compliance at follow up. Increasing levels of deprivation were associated with non-compliance at repeat testing and follow up. Modelling a reduction in the threshold for immediate referral led to a small increase in completion of the screening pathway. CONCLUSIONS: Reducing the number of positive spots required on the first gFOBt kit for referral for follow-up and targeted measures to improve compliance with follow-up may improve completion of the screening pathway